[2/4] Add DeviceStatsMonitor callback #2

Closed
wants to merge 55 commits into from

@daniellepintz (Owner) commented on Sep 17, 2021

What does this PR do?

Adds a DeviceStatsMonitor callback, which calls the get_device_stats function in the Accelerator interface. DeviceStatsMonitor will replace GPUStatsMonitor and XLAStatsMonitor.
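
For illustration, here is a minimal sketch of the idea described above: a single device-agnostic callback that delegates stat collection to the accelerator. It does not import Lightning and is not the code from this PR. `FakeAccelerator`, `RecordingLogger`, `DeviceStatsMonitorSketch`, the hook signatures, and the metric names are all hypothetical stand-ins; only the name `get_device_stats` comes from the description.

```python
from typing import Dict, List, Tuple


class FakeAccelerator:
    """Hypothetical stand-in for an accelerator that exposes get_device_stats."""

    def get_device_stats(self, device: str) -> Dict[str, float]:
        # A real accelerator would query the hardware here.
        # Fixed, made-up values keep the sketch runnable anywhere.
        return {"memory_used_mb": 1024.0, "utilization_pct": 50.0}


class RecordingLogger:
    """Hypothetical logger that stores (step, metrics) pairs in memory."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Dict[str, float]]] = []

    def log_metrics(self, metrics: Dict[str, float], step: int) -> None:
        self.records.append((step, dict(metrics)))


class DeviceStatsMonitorSketch:
    """Device-agnostic monitor: it never inspects the hardware itself."""

    def __init__(self, accelerator: FakeAccelerator, logger: RecordingLogger, device: str) -> None:
        self.accelerator = accelerator
        self.logger = logger
        self.device = device

    def _log(self, hook: str, step: int) -> None:
        # Delegate to the accelerator, then prefix each key with the hook name
        # so stats from different hooks do not overwrite each other.
        stats = self.accelerator.get_device_stats(self.device)
        prefixed = {f"{hook}/{name}": value for name, value in stats.items()}
        self.logger.log_metrics(prefixed, step=step)

    def on_train_batch_start(self, step: int) -> None:
        self._log("on_train_batch_start", step)

    def on_train_batch_end(self, step: int) -> None:
        self._log("on_train_batch_end", step)


logger = RecordingLogger()
monitor = DeviceStatsMonitorSketch(FakeAccelerator(), logger, device="cuda:0")
monitor.on_train_batch_start(step=0)
monitor.on_train_batch_end(step=0)
```

Because the monitor only depends on `get_device_stats`, supporting a new device type in this sketch means adding an accelerator implementation rather than another device-specific monitor callback.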

Fixes Lightning-AI#9032

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@daniellepintz changed the title from "[2/n] Add DeviceStatsMonitor callback" to "[2/4] Add DeviceStatsMonitor callback" on Sep 18, 2021
daniellepintz and others added 22 commits September 18, 2021 10:05
…tly into `trainer.py` (Lightning-AI#9495)

Co-authored-by: Adrian Wälchli
Co-authored-by: thomas chaton
* deprecate loggerbase.close

* deprecate warning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add to changelog

* fix import

* fix import alphabetize

* spacing?

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* copy-paste avoid pre-commit.ci?

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* literally match the other comment

* unindent

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* suggest finalize instead of save

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update tests/loggers/test_base.py

* format but to be formatted

* Update pytorch_lightning/loggers/base.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update pytorch_lightning/loggers/base.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update pytorch_lightning/loggers/base.py

Co-authored-by: Jirka Borovec

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ananthsub
Co-authored-by: Adrian Wälchli
Co-authored-by: thomas chaton
Co-authored-by: Jirka Borovec
* Fix gradient accumulation for `ShardedDataParallel`

* Update changelog

* Update pytorch_lightning/plugins/training_type/sharded.py

* add test

* Update test_sharded_plugin.py

* Update test_sharded_plugin.py

* Update test_sharded_plugin.py
awaelchli and others added 27 commits September 23, 2021 09:52
Co-authored-by: Carlos Mocholí
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec
Co-authored-by: Adrian Wälchli
Co-authored-by: thomas chaton
Co-authored-by: Carlos Mocholí
Co-authored-by: Adrian Wälchli
Co-authored-by: Carlos Mocholi
* wip

* reset _notebooks

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* reset _notebooks

* testing with mock

* update test with mock

* update test

* update tests

* update test

* remove track_load_dataloader_calls

* update last test

* remove unused imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix accumulate_grad_batches on init

* chlog

* update error

* move to callback connector

* add test with callback

* fix tests

* Update pytorch_lightning/trainer/connectors/callback_connector.py

Co-authored-by: Carlos Mocholí

* update ipu logic

* rev

* rev

* rev

* pls work

* code review

Co-authored-by: Rohit Gupta
Co-authored-by: Carlos Mocholí
…ck (Lightning-AI#9615)

* pt1 dir empty check

* clean imports

* bring back resolve mkdir?

* original doc

* warningcache

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cp callback after resolve

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move global_zero check outside warn fn

Co-authored-by: ananthsub

* move global_zero check outside warn fn 2

Co-authored-by: ananthsub

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ananthsub
Lightning-AI#9691)

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli

* Apply suggestions from code review

Co-authored-by: ananthsub

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Update pytorch_lightning/distributed/dist.py

Co-authored-by: ananthsub

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

* Apply suggestions from code review

Co-authored-by: ananthsub

* Apply suggestions from code review

* Apply suggestions from code review

* Deprecate LightningDistributed and keep logic in ddp/ddpSpawn directly

Co-authored-by: Adrian Wälchli
Co-authored-by: ananthsub
* update warnings

* add tests

* comments

* Apply suggestions from code review

* Apply suggestions from code review
…htning-AI#8989)

* Deprecate `stochastic_weight_avg` from the `Trainer` constructor

* Update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli

Co-authored-by: Adrian Wälchli
Co-authored-by: Kaushik B
Co-authored-by: Jirka Borovec
Co-authored-by: Rohit Gupta
@daniellepintz (Owner, Author)

Closing in favor of Lightning-AI#9712
